23 research outputs found

    Towards expert-inspired automatic criterion to cut a dendrogram for real-industrial applications

    Hierarchical clustering is one of the preferred ways to understand the underlying structure of a dataset and to define typologies, with many real-world applications. It is among the most popular families of clustering algorithms because it exposes the inner structure of the dataset and yields the number of clusters as an output, unlike popular methods such as k-means. The granularity of the final clustering can be adjusted to the goals of the analysis. In a hierarchical method, the number of clusters is determined by analysing the resulting dendrogram. Experts have criteria for visually inspecting the dendrogram and determining the number of clusters, and finding automatic criteria that imitate experts in this task is still an open problem. However, depending on an expert to cut the tree is a limitation in real applications in fields such as Industry 4.0 and additive manufacturing. This paper analyses several cluster validity indexes in the context of determining a suitable number of clusters in hierarchical clustering. A new Cluster Validity Index (CVI) is proposed that captures the implicit criteria used by experts when analysing dendrograms. The proposal has been applied to a range of datasets and validated against expert ground truth; it outperforms the state of the art and also significantly reduces the computational cost. Peer reviewed. Postprint (published version).
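To make the idea of an automatic cut concrete, here is a minimal sketch of one common heuristic: cut the dendrogram at the largest jump between consecutive merge heights. This is not the CVI proposed in the paper, whose definition the abstract does not give; the function name `clusters_from_largest_gap` and the sample heights are illustrative assumptions.

```python
def clusters_from_largest_gap(merge_heights):
    """Suggest a number of clusters from the largest jump in merge heights.

    merge_heights holds the n - 1 heights at which a hierarchical method
    merged n observations (hypothetical input, e.g. read off a dendrogram).
    """
    heights = sorted(merge_heights)
    n_points = len(heights) + 1  # n observations produce n - 1 merges
    if len(heights) < 2:
        return 1  # too few merges to compare gaps

    # gaps[i] is the jump between merge i and merge i + 1 (ascending order)
    gaps = [heights[i + 1] - heights[i] for i in range(len(heights) - 1)]
    best = max(range(len(gaps)), key=gaps.__getitem__)

    # Cutting inside gap `best` keeps merges 0..best, i.e. best + 1 merges,
    # and every merge reduces the cluster count by one.
    return n_points - (best + 1)


# Six observations, five merges; the largest jump is between 4.0 and 9.5.
example_heights = [0.5, 0.7, 0.9, 4.0, 9.5]
suggested_k = clusters_from_largest_gap(example_heights)
```

With these sample heights the cut falls just below the final merge, so the sketch suggests two clusters. An expert may weigh other cues as well, which is the gap the paper sets out to close.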

    Bootstrap–CURE: A novel clustering approach for sensor data: an application to 3D printing industry

    The agenda of Industry 4.0 highlights smart manufacturing, in which machines are smart enough to make data-driven decisions. Large-scale 3D printers, one of the important pillars of Industry 4.0, are equipped with smart sensors to continuously monitor print processes and make automated decisions. One of the biggest challenges for decision autonomy is to consume data quickly during the process and to extract knowledge from the printer that is suitable for improving the printing process. This paper presents an innovative unsupervised learning approach, bootstrap–CURE, to decode the sensor patterns and operation modes of 3D printers by analyzing multivariate sensor data. An automatic technique that uses the dendrogram to detect a suitable number of clusters is developed. The proposed methodology is scalable and significantly reduces computational cost compared with classical CURE. A distinct combination of the 3D printer’s sensors is found, and its impact on the printing process is also discussed. A real application is presented to illustrate the performance and usefulness of the proposal. In addition, a new state of the art for sensor data analysis is presented. This work was supported in part by KEMLG-at-IDEAI (UPC) under Grant SGR-2017-574 from the Catalan government. Peer reviewed. Postprint (published version).

    A HYBRID INTELLIGENT MODEL FOR TOURISM DEMAND FORECASTING

    The ever-increasing demand in the tourism sector worldwide has led to an increase in tourism demand forecasting methodologies. New techniques yield more reliable predictions of tourist arrivals for better economic planning. The study aims to forecast and compare the performance of two non-linear artificial intelligence approaches in predicting the number of tourist arrivals to Singapore. The Singapore inbound monthly tourism data were used to generate one-, two-, four- and six-month-ahead forecasts with non-linear autoregressive (NAR) neural networks and neuro-fuzzy systems. The predictive accuracy of NAR neural networks and neuro-fuzzy systems was compared using various performance metrics. The study revealed that neuro-fuzzy systems outperformed NAR networks in all forecasting horizons and for all countries.
The proposed neuro-fuzzy methodology helps improve the forecasting performance of artificial-intelligence-based techniques. The study contributes to the hospitality literature and could be used by managers to effectively plan and implement tourism-related policy measures.

    Molecular characterization of bread wheat (Triticum aestivum) genotypes using SSR markers

    An experiment was conducted during the winter (rabi) seasons of 2019–20 and 2020–21 at the research farm of CCS Haryana Agricultural University to study the genetic diversity of 80 bread wheat (Triticum aestivum L.) genotypes using 43 polymorphic SSR markers. A total of 84 alleles were discovered, with an average of 3 alleles amplified per locus. The average allelic PIC value varied from 0.26 to 0.82. The primers Xgwm 129, Xgwm 131, TaGST, CFA2147, Xwmc48, Xbarc 1165 and Xwmc169 may be deemed particularly informative given their high PIC values. Indices of dissimilarity varied from 0.14 to 0.42. Using the dendrogram constructed from the molecular data of the polymorphic markers, the eighty wheat genotypes were clustered into two main groups of 35 and 45 genotypes. Using STRUCTURE, the genotypes were classified into 4 major sub-populations with Fst values of 0.351, 0.363, 0.508 and 0.313, respectively. Future breeding work on wheat cultivars for tolerance to abiotic stress should consider genotypes that cluster into different groups. Assessing molecular genetic diversity is a reliable approach to identifying cultivars from their unique genetic profiles by analyzing specific regions of the cultivars' DNA.
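The abstract reports PIC values without saying how they were computed. As a hedged illustration, the sketch below uses the polymorphism information content formula of Botstein et al. (1980), which is widely used for marker data: PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2. That the authors used this exact formula is an assumption, and the function name and sample frequencies are hypothetical.

```python
def polymorphism_information_content(allele_freqs):
    """PIC of one locus from its allele frequencies (Botstein et al. form).

    allele_freqs must be non-negative and sum to 1.
    """
    if any(p < 0 for p in allele_freqs):
        raise ValueError("allele frequencies must be non-negative")
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")

    # Expected homozygosity term: sum of squared frequencies
    homozygosity = sum(p * p for p in allele_freqs)

    # Correction for uninformative matings: 2 * p_i^2 * p_j^2 over pairs i < j
    k = len(allele_freqs)
    pair_term = sum(
        2.0 * allele_freqs[i] ** 2 * allele_freqs[j] ** 2
        for i in range(k)
        for j in range(i + 1, k)
    )
    return 1.0 - homozygosity - pair_term


# A marker with two equally frequent alleles
pic_two_alleles = polymorphism_information_content([0.5, 0.5])
```

Under this formula a locus with two equally frequent alleles has a PIC of 0.375, and the value rises as more alleles occur at similar frequencies.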

    Determining automatic number of classes in hierarchical clustering by approaching experts criteria through machine learning

    Clustering is one of the most popular artificial intelligence techniques and aims at identifying groups of similar objects or patterns in the data. While multiple clustering techniques are available in the literature, hierarchical clustering remains one of the most powerful and preferred choices for unveiling the internal structure of the data in the form of a tree. Hierarchical clustering provides a dendrogram as its main output, which shows the internal similarity structure of the dataset. Deciding the correct number of clusters emerging from the dendrogram is, however, still an open problem and is left to human expertise in assessing the dendrogram. It is often impractical to assume that a human expert with sufficient domain knowledge or technical skill is available to determine the right number of clusters. The dependency on a human expert also limits automatic uses of hierarchical clustering in real-time scenarios where the objective is to capture the true nature of the data, which other clustering schemes often fail to do. Human judgment of a dendrogram also brings inherent variability from one situation to another, but in general the expert assessment introduces extra considerations that go beyond the strict evaluation of a utility function and that are worth capturing. Hence, correctly capturing and programming a method that deduces the number of clusters from a dendrogram as experts do is tricky. This research investigates how a human expert decides the best cut of the tree and determines the right number of clusters in a hierarchical clustering setting, and proposes a new criterion that captures the hidden human criteria in this task.
The research takes a hundred samples from real-time industrial data and relies on human experts to determine the number of clusters and generate a ground truth for the thesis experiments. Throughout the research, the Calinski-Harabasz index is used as the baseline cluster validity index, being the most suitable metric when hierarchical clustering is used with Ward's linkage criterion and Euclidean distance. Five new criteria have been investigated in the thesis and evaluated on the testing dataset. The proposed criterion based on the heights of the dendrogram's nodes shows an excellent match with the number of clusters chosen by human experts. The proposed cluster validity index not only outperforms other CVIs in the literature at determining the number of clusters but also reduces the computational complexity by avoiding repeated runs of a cluster validity index such as Calinski-Harabasz and by using intrinsic information from the dendrograms themselves. The proposed method also fits nicely into the wider research of the frame project by making hierarchical clustering suitable even for large datasets.
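For readers unfamiliar with the baseline, the following is a minimal pure-Python sketch of the Calinski-Harabasz index for one candidate partition, using its standard definition: between-cluster dispersion over (k - 1), divided by within-cluster dispersion over (n - k), with squared Euclidean distances. The function name, the toy points, and the choice to return infinity when clusters have zero spread are assumptions for illustration, not details from the thesis.

```python
def calinski_harabasz(points, labels):
    """Calinski-Harabasz index of a labelled partition (higher is better)."""
    n = len(points)
    dim = len(points[0])

    # Group the points by cluster label
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("the index is defined only for 2 <= k < n")

    def centroid(rows):
        return [sum(row[d] for row in rows) / len(rows) for d in range(dim)]

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = centroid(points)
    between = 0.0  # size-weighted spread of cluster centroids
    within = 0.0   # spread of points around their own centroid
    for rows in clusters.values():
        center = centroid(rows)
        between += len(rows) * squared_distance(center, overall)
        within += sum(squared_distance(row, center) for row in rows)

    if within == 0.0:
        return float("inf")  # assumption: perfectly tight clusters
    return (between / (k - 1)) / (within / (n - k))


toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
score_good = calinski_harabasz(toy_points, [0, 0, 1, 1])  # left vs right
score_bad = calinski_harabasz(toy_points, [0, 1, 0, 1])   # bottom vs top
```

Selecting the number of clusters this way means scoring every candidate cut of the tree, which is the repeated evaluation that the abstract says the proposed height-based criterion avoids.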

    COVID-19 Pandemic: What Can We Learn for Better Air Quality and Human Health?

    The COVID-19 lockdown resulted in improved air quality in many cities across the world. To identify what can be learned from the COVID-19 pandemic and subsequent lockdowns for better air quality and human health, a critical synthesis was conducted of the available evidence concerning air pollution reduction, the population at risk, and natural versus anthropogenic emissions. Can the new societal norms adopted during pandemics, such as the use of face covers, awareness of respiratory and hand hygiene, and physical distancing, help reduce the disease burden in the future? The use of masks will be more socially acceptable during high air pollution episodes in lower- and middle-income countries, which could help reduce air pollution exposure. Post-pandemic, however, some air pollution reduction strategies, such as car-pooling and the use of mass transit systems for commuting, may be affected as people try to avoid exposure to airborne infections like coronavirus. Promoting non-motorized modes of transportation such as cycling and walking within cities, as is currently being enabled in Europe and other countries, could outweigh such losses. This demands a focus on increasing walkability in towns for all ages and populations, including the differently-abled community. The study highlighted that, for better health and sustainability, there is also a need to promote other measures such as work-from-home, technological infrastructure, the extension of smart cities, and the use of information technology.

    Inverted perovskite solar cells with air stable diketopyrrolopyrrole-based electron transport layer

    One of the possible causes of degradation of perovskite solar cells is the instability of the electron transporting layer. In this regard, the design of air-stable electron transport organic semiconductors compatible with perovskite energy levels is challenging because of their inherent vulnerability to traps, presumably originating from water and/or oxygen. In this work, we demonstrate the air stability of a diketopyrrolopyrrole-based molecule (TDPP-CN4) at ambient conditions and its application as an electron transporting layer (ETL) in perovskite solar cells. We investigated the electron mobility and air stability of TDPP-CN4 by fabricating top-gate bottom-contact (TG-BC) thin film transistors and compared it with PCBM at ambient conditions. Both TDPP-CN4 and PCBM exhibit electron transport properties, with mobilities of 0.13 cm² V⁻¹ s⁻¹ and 0.03 cm² V⁻¹ s⁻¹, respectively. However, we found remarkable air stability of TDPP-CN4 in the OFET measurements under ambient conditions. These excellent properties make TDPP-CN4 a potential ETL in inverted planar heterojunction perovskite solar cells. Our preliminary device studies show a remarkable short-circuit current density (Jsc) of ~17.4 mA/cm² with a moderate open-circuit voltage (Voc) of 0.50 V. These results suggest that the electron mobility and air stability of the diketopyrrolopyrrole-based molecule hold promise for its use as an ETL in perovskite solar cells at ambient conditions.